This is the R vignette of a Single-Cell RNA-Seq Transcriptomics project. In this vignette we will start from a count table, with counts from project SRS3044261. Data can be retrieved from https://panglaodb.se/view_data.php?sra=SRA653146&srs=SRS3044261.

The analysis is performed on a dataset of lung cells (protocol: 10x chromium). Single cells were sequenced on the Illumina NovaSeq 6000.

Setup the Seurat Object

library(Seurat)
library(dplyr)
#devtools::install_github("thomasp85/patchwork")
library(patchwork)
library(ggplot2)

Let’s load the count table.

exp.mat <- read.table(file = "SRA653146_SRS3044261.mat", header = TRUE, as.is = TRUE, row.names = 1)

The row names of the table are the “annoying” x_ENSMUSG IDs with the number at the end. Let’s clean them.

ensembl <- gsub("\\_E.*", "", rownames(exp.mat))
ensembl <- make.names(ensembl, unique=TRUE)
rownames(exp.mat) <- ensembl

We create our Seurat object and complete the initialization steps.

lung <- CreateSeuratObject(counts = exp.mat, project = "lungmm", min.cells = 3, min.features = 200)             
Feature names cannot have underscores ('_'), replacing with dashes ('-')
lung
An object of class Seurat 
20730 features across 2977 samples within 1 assay 
Active assay: RNA (20730 features, 0 variable features)

min.cells is the minimum number of cells in which a gene can be detected; min.features is the minimum number of genes that have to be expressed in a cell: genes/cells not satisfying these constraints are discarded a priori. In all the table will contain 2,977 cells and 20,730 “active” genes (“features”) after filtering.

QC and selecting cells for further analysis

The first step is the quality control on cells. We use the set of all genes with name starting with “mt” to identify mitochondrial genes (data are from mouse).

unlist(rownames(exp.mat), use.names = FALSE)[grep("^mt", unlist(rownames(exp.mat), use.names = FALSE))] 
 [1] "mt.Atp6" "mt.Co1"  "mt.Co2"  "mt.Co3"  "mt.Cytb" "mt.Nd1"  "mt.Nd2"  "mt.Nd3"  "mt.Nd4"  "mt.Nd4l" "mt.Nd5"  "mt.Nd6"  "mt.Rnr1" "mt.Rnr2"
[15] "mt.Tc"   "mt.Tn"   "mt.Tp"  

Percentage of counts originating from a set of genes:

lung[["percent.mt"]] <- PercentageFeatureSet(lung, pattern = "^mt")
head(lung@meta.data, 5)

Let’s visualize QC metrics, and use these to filter cells.

VlnPlot(lung, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)

FeatureScatters:

plot1 <- FeatureScatter(lung, feature1 = "nCount_RNA", feature2 = "percent.mt")
plot2 <- FeatureScatter(lung, feature1 = "nCount_RNA", feature2 = "nFeature_RNA")
plot1 + plot2

We filter cells that have the number of expressed genes over 4,500 (doublets) or less than 200 (low quality cells/empty droplets), and we filter cells that have > 10% mitochondrial counts (low quality cells).

lung <- subset(lung, subset = nFeature_RNA > 200 & nFeature_RNA < 4500 & percent.mt < 10)

Normalizing the data

After removing unwanted cells from the dataset, the next step is to normalize the data.

lung <- NormalizeData(lung) #default parameters: normalization.method = "LogNormalize", scale.factor = 10000
Performing log-normalization
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|

Identification of highly variable features (feature selection)

The next step is to restrict the gene set to the “most variable” genes: those genes that exhibit the highest cell-to-cell variation in the dataset. By default the top 2000 variable genes are kept for all downstream analyses. Changing this number clearly will change the final results.

lung <- FindVariableFeatures(lung, selection.method = "vst", nfeatures = 2000)
Calculating gene variances
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
# Identify the 10 most highly variable genes
top10 <- head(VariableFeatures(lung), 10)
# plot variable features with and without labels
plot1 <- VariableFeaturePlot(lung)
plot2 <- LabelPoints(plot = plot1, points = top10, repel = TRUE)
When using repel, set xnudge and ynudge to 0 for optimal results
plot1+plot2

Top 10 most variable genes:

top10
 [1] "Lyz2"   "Chil3"  "Cd74"   "Wfdc2"  "Cbr2"   "Sftpc"  "Ccl6"   "Sftpb"  "Sftpa1" "Ear2"  

Scaling the data

All log-normalized counts are transformed so that they have mean 0 and variance 1 across all cells, regardless of the count values (high or low). This in practice correspond to a “binarization” of the data, or rather “ternarization”, where in general for each cell a gene will be tend “up” (>0), “average” (=0) or “down” (<0) with very close values.

all.genes <- rownames(lung)
lung <- ScaleData(lung, features = all.genes)

The list of S and G2M specific genes are already pre-loaded in Seurat, in “cc.genes”. We can segregate this list into markers of G2/M phase and markers of S phase.

s.genes <- stringr::str_to_title(cc.genes$s.genes) # stringr::str_to_title(): fro ex. from "PCNA" to "Pcna" since we are working on Mm genes
g2m.genes <- stringr::str_to_title(cc.genes$g2m.genes)
lung <- CellCycleScoring(lung, s.features = s.genes, g2m.features = g2m.genes, set.ident = TRUE)
The following features are not present in the object: Fen1, Mlf1ip, not searching for symbol synonyms

View cell cycle scores and phase assignments.

head(lung[[]])

Running a PCA on cell cycle genes reveals that cells do not separate entirely by phase -> no need to regress them out.

lung <- RunPCA(lung, features = c(s.genes, g2m.genes))
The following 2 features requested have not been scaled (running reduction without them): Fen1, Mlf1ipYou're computing too large a percentage of total singular values, use a standard svd instead.PC_ 1 
Positive:  Birc5, Top2a, Mki67, Cdk1, Cdca3, Tacc3, Kif23, Kif11, Ube2c, Uhrf1 
       Tpx2, Rrm2, Hmmr, Ccnb2, Aurkb, Cdca8, Nusap1, Dlgap5, Cks2, Cenpe 
       Cenpa, Ncapd2, Ckap2l, Rad51ap1, Hmgb2, Kif20b, Ckap2, Cks1b, Ect2, Kif2c 
Negative:  Msh2, Cbx5, Ccne2, Ttk, Brip1, Ubr7, Cdc6, Hjurp, Slbp, Pold3 
       Cdc25c, Casp8ap2, Exo1, Chaf1b, Cdca7, Mcm4, Rpa2, Ckap5, G2e3, Wdr76 
       Ctcf, Tipin, Ung, Tubb4b, Blm, Usp1, Nasp, Lbr, Cdc45, Psrc1 
PC_ 2 
Positive:  Mcm6, Mcm5, Pcna, Hells, Hn1, Hmgb2, Rfc2, Ung, Dtl, Mcm4 
       Atad2, Gmnn, Nasp, Mcm2, Uhrf1, Pola1, Slbp, Cdca7, Smc4, Rpa2 
       Prim1, Casp8ap2, Lbr, Chaf1b, Usp1, Rrm1, Cdc45, Blm, Rrm2, Wdr76 
Negative:  Ccnb2, Cdca3, Kif2c, Ube2c, Ckap2l, Cenpa, Cdc20, Cenpf, Hmmr, Nek2 
       Tpx2, Cenpe, Fam64a, Ect2, Birc5, Kif23, Nusap1, Psrc1, Aurka, Cdca2 
       G2e3, Cbx5, Ckap2, Dlgap5, Aurkb, Bub1, Cks2, Cks1b, Gas2l3, Tacc3 
PC_ 3 
Positive:  Cbx5, Uhrf1, Rrm1, Tyms, Cdk1, Hells, Rrm2, Hjurp, Dtl, E2f8 
       Atad2, Cdc6, Tipin, Cdca7, Top2a, Ncapd2, Cks1b, Cdc45, Clspn, Pola1 
       Kif11, Mcm5, Mcm2, Rad51, Gins2, Blm, Cdca8, Dscc1, Tacc3, Mki67 
Negative:  Hmgb2, Hn1, Cks2, Lbr, Rfc2, Smc4, Anp32e, Mcm6, Cenpa, Rpa2 
       Rangap1, Slbp, Cdc20, Ubr7, Msh2, Tubb4b, Kif2c, Nek2, Fam64a, Ctcf 
       G2e3, Prim1, Cenpf, Ect2, Cdca3, Kif20b, Ccnb2, Pold3, Casp8ap2, Pcna 
PC_ 4 
Positive:  Tubb4b, Ckap5, Cbx5, Tmpo, Ctcf, Usp1, Casp8ap2, Pold3, Cks1b, Hjurp 
       G2e3, Pola1, Cdc6, Aurka, Gmnn, Ubr7, Anp32e, Nasp, Pcna, Tipin 
       Kif20b, Mcm5, Cdc20, Atad2, Ccnb2, Hn1, Psrc1, Cdca8, Nek2, Rrm1 
Negative:  Ncapd2, Mcm4, Slbp, Uhrf1, E2f8, Hmgb2, Rrm2, Cdca7, Tpx2, Smc4 
       Gins2, Cdk1, Mki67, Aurkb, Dtl, Cks2, Cdca3, Clspn, Birc5, Kif11 
       Top2a, Brip1, Rad51, Wdr76, Gas2l3, Tacc3, Nuf2, Ung, Hmmr, Cdc45 
PC_ 5 
Positive:  Pold3, Mcm2, Wdr76, Hjurp, Pola1, Gmnn, Blm, Rangap1, Tyms, Smc4 
       Rpa2, Anln, Casp8ap2, Lbr, Dlgap5, Rad51ap1, Mcm5, Hells, Hn1, Dscc1 
       Rrm2, Mki67, Cenpa, Rfc2, Cdca8, Ccnb2, Brip1, Ckap2, Top2a, Tubb4b 
Negative:  Mcm4, Prim1, Ung, Cbx5, Cdc45, Cdca7, Cdc6, Ctcf, Nasp, G2e3 
       Slbp, Usp1, Mcm6, Tipin, Gins2, Ckap5, Dtl, Rrm1, Aurka, Cks2 
       Msh2, Gas2l3, Hmgb2, Tmpo, Psrc1, Cks1b, Hmmr, Clspn, Tacc3, Uhrf1 
DimPlot(lung)

Perform linear dimensional reduction

The next step is the dimensionality reduction, on scaled counts and on the 2000 most variable genes extracted before, with PCA.

lung <- RunPCA(lung, features = VariableFeatures(object = lung))
PC_ 1 
Positive:  Igfbp7, Sparc, Mgp, Col3a1, Col1a1, Col1a2, Timp3, Apoe, Nbl1, Inmt 
       Mfap5, Nupr1, Clec3b, Tmem176a, Id3, Fbln1, Timp2, Cebpd, Mfap4, Cfh 
       Igfbp6, Lrp4, Rarres2, Mylk, Eln, Fstl1, Jun, Cyr61, Tcf21, Gas6 
Negative:  Fcer1g, Tyrobp, Cd52, Rac2, Coro1a, Laptm5, Arhgdib, Nkg7, Ptpn18, Hcst 
       Selplg, Cd53, Ptprc, Gzma, Ptprcap, Klrd1, Ccl5, Ms4a4b, Klre1, Klrb1c 
       Fxyd5, Klrk1, Ncr1, Itgb2, RP23.52N2.1, Sh3bgrl3, Prf1, Arhgap45, Gimap4, Ctsw 
PC_ 2 
Positive:  Dynlrb2, RP23.100C7.3, Fam183b, RP23.96I9.2, Sec14l3, Tmem212, RP23.167L1.8, Ccdc153, AC132584.3, Arhgdig 
       RP24.73K7.1, RP23.430H1.2, Lrrc10b, RP23.281F13.4, RP24.160I20.1, Cfap126, Erich2, Meig1, Cyp2f2, Krt8 
       Rsph1, Tctex1d4, Cldn3, Cyp2s1, Cbr2, Cxcl17, Mlf1, Smim22, AC157761.1, Efcab10 
Negative:  Mgp, Inmt, Igfbp7, Sparc, Apoe, Col1a1, Col3a1, Col1a2, Mfap4, Mfap5 
       Cxcl14, Clec3b, Lrp4, Cdo1, Timp2, Gas7, G0s2, Fbln1, Tcf21, Cebpd 
       Mylk, Npnt, Nbl1, Cfh, AC161169.2, Tubb5, Cyr61, Eln, Timp3, Fstl1 
PC_ 3 
Positive:  Ear2, Mpeg1, Chil3, Cybb, Ctss, Mcemp1, Ccl6, Clec7a, Alox5ap, Ltc4s 
       Ear1, Lyz2, Mrc1, Atp6v0d2, Il1rn, Plet1, Cd68, Cd300d, Spi1, Krt79 
       Bcl2a1a, Clec4n, Fabp1, Olr1, Csf2rb, Nceh1, Sirpb1c, Sirpa, Klra2, Wfdc21 
Negative:  Nkg7, Gzma, Klre1, Klrb1c, Ncr1, Klrk1, Ms4a4b, Klrd1, Prf1, Ccl5 
       Ptprcap, Il2rb, Gimap4, Gzmb, Ctsw, RP23.52N2.1, Klra13.ps, Klra9, Cma1, Sept1 
       Lck, Klra8, Klrc2, H2.Q7, Serpinb9, Cst7, Klra11.ps, Klri2, Klra4, Il18r1 
PC_ 4 
Positive:  S100a6, Igfbp6, Clec3b, Cygb, Cfh, Dcn, Mfap5, Igfbp7, Ly6a, Adamts2 
       C3, Mgp, Cd34, Fbln1, Pi16, Gas1, Timp3, Serpinf1, Col14a1, Timp2 
       Col1a2, Cxcl12, Aebp1, Gpc3, Gas7, Igfbp4, Scara5, Htra3, Mmp2, Serpina3n 
Negative:  Cox4i2, Higd1b, Postn, RP23.250M11.8, Fam162b, Gucy1b3, Vtn, Lipg, Gucy1a3, Itga1 
       Notch3, Hbegf, Ndufa4l2, Vsnl1, Tmem178, Emid1, Gm26580, Art3, Rgcc, Trpc6 
       Adcy8, Pde5a, Agtr1a, Pdzd2, Lmcd1, Gja4, Gap43, Ltbp2, Tusc5, Mcam 
PC_ 5 
Positive:  Npnt, Cdo1, Inmt, Cxcl14, G0s2, Gja5, Wisp2, Slc7a10, Ces1d, AC167245.5 
       Mfap4, Lrat, Tcf21, Lrp4, Tmem176a, Colq, Mylk, AC161169.2, Apoe, Hspb1 
       Cp, Prdx6, Gadd45b, Gsta3, Vcam1, Trf, RP23.189B9.5, Plin2, Cpa3, RP23.375N21.2 
Negative:  Dcn, Meg3, Ahnak2, Ly6a, Pi16, Col14a1, Aldh1a2, Crip1, Prss23, Scara5 
       Il33, Entpd2, Igfbp4, Serpinf1, Rspo1, Aqp1, Ebf1, Slco2a1, Cd248, Gpm6a 
       Ly6c1, Upk3b, RP23.162H6.1, Cpe, Anxa3, Ccl11, Nkain4, Msln, Aebp1, Lrrn4 
print(lung[["pca"]], dims = 1:5, nfeatures = 5)
PC_ 1 
Positive:  Igfbp7, Sparc, Mgp, Col3a1, Col1a1 
Negative:  Fcer1g, Tyrobp, Cd52, Rac2, Coro1a 
PC_ 2 
Positive:  Dynlrb2, RP23.100C7.3, Fam183b, RP23.96I9.2, Sec14l3 
Negative:  Mgp, Inmt, Igfbp7, Sparc, Apoe 
PC_ 3 
Positive:  Ear2, Mpeg1, Chil3, Cybb, Ctss 
Negative:  Nkg7, Gzma, Klre1, Klrb1c, Ncr1 
PC_ 4 
Positive:  S100a6, Igfbp6, Clec3b, Cygb, Cfh 
Negative:  Cox4i2, Higd1b, Postn, RP23.250M11.8, Fam162b 
PC_ 5 
Positive:  Npnt, Cdo1, Inmt, Cxcl14, G0s2 
Negative:  Dcn, Meg3, Ahnak2, Ly6a, Pi16 
VizDimLoadings(lung, dims = 1:2, reduction = "pca")

We visualize the results on the first two PC.

DimPlot(lung, reduction = "pca")

We can plot the data projected on any of the PCA dimensions.

DimPlot(lung, reduction = "pca", dims = c(3,4))

Heatmap of the most variable genes on the most variable cells in the first PC:

DimHeatmap(lung, dims = 1, cells = 500, balanced = TRUE)

There are several genes with a clear difference of expression across the most variable cells, splitting them into two main groups. The same can be done for the other PCs.

DimHeatmap(lung, dims = 1:9, cells = 500, balanced = TRUE)

Determine the “dimensionality” of the dataset

Next step is to determine the “dimensionality” of the dataset. “Significant” PCs in JackStrawPlot() will show a strong enrichment of features with low p-values (solid curve above the dashed line).

lung <- JackStraw(lung, num.replicate = 100)
lung <- ScoreJackStraw(lung, dims = 1:20)
JackStrawPlot(lung, dims = 1:20)

An alternative heuristic method generates an “Elbow plot”. Here we can observe an “elbow” around PC20, suggesting that the majority of true signal is captured in the first 20 PCs.

ElbowPlot(lung)

Cluster the cells

Setting resolution between 0.4-1.2 typically returns good results for single-cell datasets of around 3K cells. Optimal resolution often increases for larger datasets. The clusters can be found using the Idents function.

lung <- FindNeighbors(lung, dims = 1:20)
Computing nearest neighbor graph
Computing SNN
lung <- FindClusters(lung, resolution = 0.5)
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 2935
Number of edges: 101819

Running Louvain algorithm...
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.9060
Number of communities: 14
Elapsed time: 0 seconds
head(Idents(lung), 5)
AAACCTGAGACAAAGG AAACCTGAGACTAAGT AAACCTGAGCGTCTAT AAACCTGAGCTGGAAC AAACCTGCATCGGTTA 
               0                0                3                0                4 
Levels: 0 1 2 3 4 5 6 7 8 9 10 11 12 13

Now the PCA plot is labeled with clusters.

DimPlot(lung, reduction = "pca")

Run non-linear dimensional reduction

UMAP:

#reticulate::py_install(packages ='umap-learn')
lung <- RunUMAP(lung, dims = 1:20)
The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session15:15:57 UMAP embedding parameters a = 0.9922 b = 1.112
15:15:57 Read 2935 rows and found 20 numeric columns
15:15:57 Using Annoy for neighbor search, n_neighbors = 30
15:15:57 Building Annoy index with metric = cosine, n_trees = 50
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:15:57 Writing NN index file to temp file /var/folders/4k/0lqzcm0545d30b5nyr_dptyh0000gn/T//RtmpO3l2gf/file3a615425e71
15:15:57 Searching Annoy index using 1 thread, search_k = 3000
15:15:58 Annoy recall = 100%
15:16:00 Commencing smooth kNN distance calibration using 1 thread
15:16:03 Initializing from normalized Laplacian + noise
15:16:03 Commencing optimization for 500 epochs, with 118972 positive edges
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
15:16:08 Optimization finished
DimPlot(lung, reduction = "umap")

We visualize here a table containing (identified by number) the number of cells belonging to each cluster, useful to be compared to the “cluster table” on PanglaoDB (https://panglaodb.se/list_clusters_and_cell_types.html?sra=SRA653146&srs=SRS3044261).

table(Idents(lung))

  0   1   2   3   4   5   6   7   8   9  10  11  12  13 
676 527 414 350 343 270 105  77  59  29  27  20  19  19 

Let’s save the object.

saveRDS(lung, file = "lung.rds")

Finding differentially expressed features (cluster biomarkers)

The final step is to give an “identity” to the clusters. That is, find which are the “marker genes” (expressed exclusively, or at least over-expressed) in each cluster with respect to the others. Then, trying to figure out, according to the marker genes of each cluster, what could be the corresponding cell type.

We can find the “differentially expressed genes” for a cluster against the others, with the additional condition that the gene has to be expressed in at least the 25% of the cells in the cluster:

cluster1.markers <- FindMarkers(lung, ident.1 = 1, min.pct = 0.25)
head(cluster1.markers, n = 5)

Or choose to compare explicitly two or more clusters. Here cluster 5 is compared to clusters 0 and 3.

cluster5.markers <- FindMarkers(lung, ident.1 = 5, ident.2 = c(0, 3), min.pct = 0.25)
head(cluster5.markers, n = 5)

Or, find the markers of all clusters, with respect to the others (this will take longer). The results are stored in a table with two genes for each cluster. Here the criteria are a) expressed in at least 25% of the cells and b) with a log2 fold change of at least 0.25.

lung.markers <- FindAllMarkers(lung, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25)
lung.markers %>% group_by(cluster) %>% top_n(n = 2, wt = avg_log2FC)

There are several tools for visualizing marker expression. VlnPlot() (shows expression probability distributions across clusters), and FeaturePlot() are most commonly used visualizations. There are also RidgePlot() and DotPlot() as additional methods to view our dataset.

# Violin plot of the expression of the best "marker gene" for each cluster:
# VlnPlot(lung, features = features = c("Slc7a10", "Gzma", "Rbp1", "Pi16", "Vcam1", "Mustn1", "Cox4i2", "Ear2", "Ms4a6c", "Igkc", "Sftpa1", "Ak7", "Aqp1", "Rac2")) splitted into (for better visualization):
VlnPlot(lung, features = c("Slc7a10", "Gzma", "Rbp1"))

VlnPlot(lung, features = c("Pi16", "Vcam1", "Mustn1"))

VlnPlot(lung, features = c("Cox4i2", "Ear2", "Ms4a6c")) 

VlnPlot(lung, features = c("Igkc", "Sftpa1", "Ak7")) 

VlnPlot(lung, features = "Aqp1") 

VlnPlot(lung, features = "Rac2") # Unknown

#FeaturePlot(lung, features = c("Slc7a10", "Gata3", "Gzma", "Ccl5", "Ifng", "Rbp1", "Pi16", "Vcam1", "Mustn1", "Acta2", "Cox4i2", "Higd1b", "Chil3", "Ear1", "Ear2", "Ms4a6c", "Ccr7", "Igkc", "Sftpa1", "Sftpc", "Sfta2", "Egfl6", "Lamp3", "Ak7", "Aqp1", "Bcam", "Rac2", "Atf3")) splitted into (for better visualization):
FeaturePlot(lung, features = c("Slc7a10", "Gata3", "Gzma", "Ccl5"))

FeaturePlot(lung, features = c("Ifng", "Rbp1", "Pi16", "Vcam1"))

FeaturePlot(lung, features = c("Mustn1", "Acta2", "Cox4i2", "Higd1b"))

FeaturePlot(lung, features = c("Chil3", "Ear1", "Ear2", "Ms4a6c"))

FeaturePlot(lung, features = c("Ccr7", "Igkc", "Sftpa1", "Sftpc"))

FeaturePlot(lung, features = c("Sfta2", "Egfl6", "Lamp3", "Ak7"))

FeaturePlot(lung, features = c("Aqp1", "Bcam", "Rac2", "Atf3"))

RidgePlot(lung, features = c("Gzma", "Ear2"), ncol = 2) # Ridge plots from ggridges. Visualize single cell expression distributions in each cluster

DotPlot(lung, features = c("Slc7a10", "Gata3", "Gzma", "Ccl5", "Ifng", "Rbp1", "Pi16", "Vcam1", "Mustn1", "Acta2", "Cox4i2", "Higd1b", "Chil3", "Ear1", "Ear2", "Ms4a6c", "Ccr7", "Igkc", "Sftpa1", "Sftpc", "Sfta2", "Egfl6", "Lamp3", "Ak7", "Aqp1", "Bcam", "Rac2", "Atf3", "Irf8")) + RotatedAxis() # the size of the dot corresponds to the percentage of cells expressing the feature in each cluster. The color represents the average expression level

DoHeatmap() generates an expression heatmap for given cells and features. In this case, we are plotting the top 10 markers for each cluster.

lung.markers %>%
    group_by(cluster) %>%
    top_n(n = 10, wt = avg_log2FC) -> top10
DoHeatmap(lung, features = top10$gene) + NoLegend()

Assigning cell type identity to clusters

new.cluster.ids <- c("Fibroblasts", "NK", "Fibroblasts", "Fibroblasts", "Fibroblasts", "SMCs", "Pericytes", "Alveolar m.", "Macrophages", "B", "PneumocytesII", "Ependymal", "Endothelial", "Unknown")
names(new.cluster.ids) <- levels(lung)
lung <- RenameIdents(lung, new.cluster.ids)
DimPlot(lung, reduction = "umap", label = TRUE, pt.size = 0.5) + NoLegend()

saveRDS(lung, file = "lung_final.rds")

Single cell heatmap showing the expression of best “marker gene” for each cluster overlayed on the UMAP projection of the cells.

DoHeatmap(subset(lung, downsample = 100), features = c("Slc7a10", "Gzma", "Rbp1", "Pi16", "Vcam1", "Mustn1", "Cox4i2", "Ear2", "Ms4a6c", "Igkc", "Sftpa1", "Ak7", "Aqp1", "Rac2"), size = 3)

sessionInfo()

This analysis was conducted on:

sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] it_IT.UTF-8/it_IT.UTF-8/it_IT.UTF-8/C/it_IT.UTF-8/it_IT.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggplot2_3.3.5               patchwork_1.1.1             dplyr_1.0.7                 SeuratObject_4.0.2          Seurat_4.0.3               
 [6] SummarizedExperiment_1.20.0 Biobase_2.50.0              GenomicRanges_1.42.0        GenomeInfoDb_1.26.7         IRanges_2.24.1             
[11] S4Vectors_0.28.1            BiocGenerics_0.36.1         MatrixGenerics_1.2.1        matrixStats_0.59.0          edgeR_3.32.1               
[16] limma_3.46.0               

loaded via a namespace (and not attached):
  [1] Rtsne_0.15             colorspace_2.0-2       deldir_0.2-10          ellipsis_0.3.2         ggridges_0.5.3         XVector_0.30.0        
  [7] rstudioapi_0.13        spatstat.data_2.1-0    leiden_0.3.8           listenv_0.8.0          ggrepel_0.9.1          fansi_0.5.0           
 [13] codetools_0.2-18       splines_4.0.4          knitr_1.33             polyclip_1.10-0        jsonlite_1.7.2         ica_1.0-2             
 [19] cluster_2.1.2          png_0.1-7              uwot_0.1.10            shiny_1.6.0            sctransform_0.3.2      spatstat.sparse_2.0-0 
 [25] compiler_4.0.4         httr_1.4.2             assertthat_0.2.1       Matrix_1.3-4           fastmap_1.1.0          lazyeval_0.2.2        
 [31] cli_3.0.0              later_1.2.0            htmltools_0.5.1.1      tools_4.0.4            igraph_1.2.6           gtable_0.3.0          
 [37] glue_1.4.2             GenomeInfoDbData_1.2.4 RANN_2.6.1             reshape2_1.4.4         Rcpp_1.0.6             scattermore_0.7       
 [43] vctrs_0.3.8            nlme_3.1-152           lmtest_0.9-38          xfun_0.24              stringr_1.4.0          globals_0.14.0        
 [49] mime_0.11              miniUI_0.1.1.1         lifecycle_1.0.0        irlba_2.3.3            goftest_1.2-2          future_1.21.0         
 [55] zlibbioc_1.36.0        MASS_7.3-54            zoo_1.8-9              scales_1.1.1           spatstat.core_2.2-0    promises_1.2.0.1      
 [61] spatstat.utils_2.2-0   RColorBrewer_1.1-2     yaml_2.2.1             reticulate_1.20        pbapply_1.4-3          gridExtra_2.3         
 [67] rpart_4.1-15           stringi_1.6.2          rlang_0.4.11           pkgconfig_2.0.3        bitops_1.0-7           evaluate_0.14         
 [73] lattice_0.20-44        ROCR_1.0-11            purrr_0.3.4            tensor_1.5             htmlwidgets_1.5.3      cowplot_1.1.1         
 [79] tidyselect_1.1.1       parallelly_1.26.1      RcppAnnoy_0.0.18       plyr_1.8.6             magrittr_2.0.1         R6_2.5.0              
 [85] generics_0.1.0         DelayedArray_0.16.3    DBI_1.1.1              withr_2.4.2            pillar_1.6.1           mgcv_1.8-36           
 [91] fitdistrplus_1.1-5     survival_3.2-11        abind_1.4-5            RCurl_1.98-1.3         tibble_3.1.2           future.apply_1.7.0    
 [97] crayon_1.4.1           KernSmooth_2.23-20     utf8_1.2.1             spatstat.geom_2.2-0    plotly_4.9.4.1         rmarkdown_2.9         
[103] locfit_1.5-9.4         grid_4.0.4             data.table_1.14.0      digest_0.6.27          xtable_1.8-4           tidyr_1.1.3           
[109] httpuv_1.6.1           munsell_0.5.0          viridisLite_0.4.0     

Bibliography

Hao, Hao, et al., Cell 2021; Seurat V4; https://satijalab.org/seurat/

Giulio Pavesi; Seurat example on 10x Data (2021 update); http://159.149.160.56/Transcriptomics/seurat.html

LS0tCnRpdGxlOiAiU2luZ2xlLUNlbGwgUk5BLVNlcSBUcmFuc2NyaXB0b21pY3MgcHJvamVjdCIKb3V0cHV0OiBodG1sX25vdGVib29rCmF1dGhvcjogRW1hbnVlbGUgQ2F2YWxsZXJpIAotLS0KClRoaXMgaXMgdGhlIFIgdmlnbmV0dGUgb2YgYSBTaW5nbGUtQ2VsbCBSTkEtU2VxIFRyYW5zY3JpcHRvbWljcyBwcm9qZWN0LiBJbiB0aGlzIHZpZ25ldHRlIHdlIHdpbGwgc3RhcnQgZnJvbSBhIGNvdW50IHRhYmxlLCB3aXRoIGNvdW50cyBmcm9tIHByb2plY3QgU1JTMzA0NDI2MS4gRGF0YSBjYW4gYmUgcmV0cmlldmVkIGZyb20gaHR0cHM6Ly9wYW5nbGFvZGIuc2Uvdmlld19kYXRhLnBocD9zcmE9U1JBNjUzMTQ2JnNycz1TUlMzMDQ0MjYxLgoKVGhlIGFuYWx5c2lzIGlzIHBlcmZvcm1lZCBvbiBhIGRhdGFzZXQgb2YgbHVuZyBjZWxscyAocHJvdG9jb2w6IDEweCBjaHJvbWl1bSkuIFNpbmdsZSBjZWxscyB3ZXJlIHNlcXVlbmNlZCBvbiB0aGUgSWxsdW1pbmEgTm92YVNlcSA2MDAwLiAKCiMgU2V0dXAgdGhlIFNldXJhdCBPYmplY3QKCmBgYHtyfQpsaWJyYXJ5KFNldXJhdCkKbGlicmFyeShkcGx5cikKI2RldnRvb2xzOjppbnN0YWxsX2dpdGh1YigidGhvbWFzcDg1L3BhdGNod29yayIpCmxpYnJhcnkocGF0Y2h3b3JrKQpsaWJyYXJ5KGdncGxvdDIpCmBgYAoKTGV0J3MgbG9hZCB0aGUgY291bnQgdGFibGUuCmBgYHtyfQpleHAubWF0IDwtIHJlYWQudGFibGUoZmlsZSA9ICJTUkE2NTMxNDZfU1JTMzA0NDI2MS5tYXQiLCBoZWFkZXIgPSBUUlVFLCBhcy5pcyA9IFRSVUUsIHJvdy5uYW1lcyA9IDEpCmBgYAoKVGhlIHJvdyBuYW1lcyBvZiB0aGUgdGFibGUgYXJlIHRoZSAiYW5ub3lpbmciIHhfRU5TTVVTRyBJRHMgd2l0aCB0aGUgbnVtYmVyIGF0IHRoZSBlbmQuIExldOKAmXMgY2xlYW4gdGhlbS4gCmBgYHtyfQplbnNlbWJsIDwtIGdzdWIoIlxcX0UuKiIsICIiLCByb3duYW1lcyhleHAubWF0KSkKZW5zZW1ibCA8LSBtYWtlLm5hbWVzKGVuc2VtYmwsIHVuaXF1ZT1UUlVFKQpyb3duYW1lcyhleHAubWF0KSA8LSBlbnNlbWJsCmBgYAoKV2UgY3JlYXRlIG91ciBTZXVyYXQgb2JqZWN0IGFuZCBjb21wbGV0ZSB0aGUgaW5pdGlhbGl6YXRpb24gc3RlcHMuCmBgYHtyfQpsdW5nIDwtIENyZWF0ZVNldXJhdE9iamVjdChjb3VudHMgPSBleHAubWF0LCBwcm9qZWN0ID0gImx1bmdtbSIsIG1pbi5jZWxscyA9IDMsIG1pbi5mZWF0dXJlcyA9IDIwMCkJCQkJCmx1bmcKYGBgCgptaW4uY2VsbHMgaXMgdGhlIG1pbmltdW0gbnVtYmVyIG9mIGNlbGxzIGluIHdoaWNoIGEgZ2VuZSBjYW4gYmUgZGV0ZWN0ZWQ7IG1pbi5mZWF0dXJlcyBpcyB0aGUgbWluaW11bSBudW1iZXIgb2YgZ2VuZXMgdGhhdCBoYXZlIHRvIGJlIGV4cHJlc3NlZCBpbiBhIGNlbGw6IGdlbmVzL2NlbGxzIG5vdCBzYXRpc2Z5aW5nIHRoZXNlIGNvbnN0cmFpbnRzIGFyZSBkaXNjYXJkZWQgYSBwcmlvcmkuIEluIGFsbCB0aGUgdGFibGUgd2lsbCBjb250YWluIDIsOTc3IGNlbGxzIGFuZCAyMCw3MzAgImFjdGl2ZSIgZ2VuZXMgKCJmZWF0dXJlcyIpIGFmdGVyIGZpbHRlcmluZy4gCgojIFFDIGFuZCBzZWxlY3RpbmcgY2VsbHMgZm9yIGZ1cnRoZXIgYW5hbHlzaXMKClRoZSBmaXJzdCBzdGVwIGlzIHRoZSBxdWFsaXR5IGNvbnRyb2wgb24gY2VsbHMuIFdlIHVzZSB0aGUgc2V0IG9mIGFsbCBnZW5lcyB3aXRoIG5hbWUgc3RhcnRpbmcgd2l0aCAibXQiIHRvIGlkZW50aWZ5IG1pdG9jaG9uZHJpYWwgZ2VuZXMgKGRhdGEgYXJlIGZyb20gbW91c2UpLgpgYGB7cn0KdW5saXN0KHJvd25hbWVzKGV4cC5tYXQpLCB1c2UubmFtZXMgPSBGQUxTRSlbZ3JlcCgiXm10IiwgdW5saXN0KHJvd25hbWVzKGV4cC5tYXQpLCB1c2UubmFtZXMgPSBGQUxTRSkpXSAKYGBgCgpQZXJjZW50YWdlIG9mIGNvdW50cyBvcmlnaW5hdGluZyBmcm9tIGEgc2V0IG9mIGdlbmVzOgpgYGB7cn0KbHVuZ1tbInBlcmNlbnQubXQiXV0gPC0gUGVyY2VudGFnZUZlYXR1cmVTZXQobHVuZywgcGF0dGVybiA9ICJebXQiKQpoZWFkKGx1bmdAbWV0YS5kYXRhLCA1KQpgYGAKCkxldCdzIHZpc3VhbGl6ZSBRQyBtZXRyaWNzLCBhbmQgdXNlIHRoZXNlIHRvIGZpbHRlciBjZWxscy4KYGBge3J9ClZsblBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJuRmVhdHVyZV9STkEiLCAibkNvdW50X1JOQSIsICJwZXJjZW50Lm10IiksIG5jb2wgPSAzKQpgYGAKCkZlYXR1cmVTY2F0dGVyczoKYGBge3J9CnBsb3QxIDwtIEZlYXR1cmVTY2F0dGVyKGx1bmcsIGZlYXR1cmUxID0gIm5Db3VudF9STkEiLCBmZWF0dXJlMiA9ICJwZXJjZW50Lm10IikKcGxvdDIgPC0gRmVhdHVyZVNjYXR0ZXIobHVuZywgZmVhdHVyZTEgPSAibkNvdW50X1JOQSIsIGZlYXR1cmUyID0gIm5GZWF0dXJlX1JOQSIpCnBsb3QxICsgcGxvdDIKYGBgCgpXZSBmaWx0ZXIgY2VsbHMgdGhhdCBoYXZlIHRoZSBudW1iZXIgb2YgZXhwcmVzc2VkIGdlbmVzIG92ZXIgNCw1MDAgKGRvdWJsZXRzKSBvciBsZXNzIHRoYW4gMjAwIChsb3cgcXVhbGl0eSBjZWxscy9lbXB0eSBkcm9wbGV0cyksIGFuZCB3ZSBmaWx0ZXIgY2VsbHMgdGhhdCBoYXZlID4gMTAlIG1pdG9jaG9uZHJpYWwgY291bnRzIChsb3cgcXVhbGl0eSBjZWxscykuCmBgYHtyfQpsdW5nIDwtIHN1YnNldChsdW5nLCBzdWJzZXQgPSBuRmVhdHVyZV9STkEgPiAyMDAgJiBuRmVhdHVyZV9STkEgPCA0NTAwICYgcGVyY2VudC5tdCA8IDEwKQpgYGAKCiMgTm9ybWFsaXppbmcgdGhlIGRhdGEKCkFmdGVyIHJlbW92aW5nIHVud2FudGVkIGNlbGxzIGZyb20gdGhlIGRhdGFzZXQsIHRoZSBuZXh0IHN0ZXAgaXMgdG8gbm9ybWFsaXplIHRoZSBkYXRhLgpgYGB7cn0KbHVuZyA8LSBOb3JtYWxpemVEYXRhKGx1bmcpICNkZWZhdWx0IHBhcmFtZXRlcnM6IG5vcm1hbGl6YXRpb24ubWV0aG9kID0gIkxvZ05vcm1hbGl6ZSIsIHNjYWxlLmZhY3RvciA9IDEwMDAwCmBgYAoKIyBJZGVudGlmaWNhdGlvbiBvZiBoaWdobHkgdmFyaWFibGUgZmVhdHVyZXMgKGZlYXR1cmUgc2VsZWN0aW9uKQoKVGhlIG5leHQgc3RlcCBpcyB0byByZXN0cmljdCB0aGUgZ2VuZSBzZXQgdG8gdGhlICJtb3N0IHZhcmlhYmxlIiBnZW5lczogdGhvc2UgZ2VuZXMgdGhhdCBleGhpYml0IHRoZSBoaWdoZXN0IGNlbGwtdG8tY2VsbCB2YXJpYXRpb24gaW4gdGhlIGRhdGFzZXQuIEJ5IGRlZmF1bHQgdGhlIHRvcCAyMDAwIHZhcmlhYmxlIGdlbmVzIGFyZSBrZXB0IGZvciBhbGwgZG93bnN0cmVhbSBhbmFseXNlcy4gQ2hhbmdpbmcgdGhpcyBudW1iZXIgY2xlYXJseSB3aWxsIGNoYW5nZSB0aGUgZmluYWwgcmVzdWx0cy4KYGBge3J9Cmx1bmcgPC0gRmluZFZhcmlhYmxlRmVhdHVyZXMobHVuZywgc2VsZWN0aW9uLm1ldGhvZCA9ICJ2c3QiLCBuZmVhdHVyZXMgPSAyMDAwKQojIElkZW50aWZ5IHRoZSAxMCBtb3N0IGhpZ2hseSB2YXJpYWJsZSBnZW5lcwp0b3AxMCA8LSBoZWFkKFZhcmlhYmxlRmVhdHVyZXMobHVuZyksIDEwKQojIHBsb3QgdmFyaWFibGUgZmVhdHVyZXMgd2l0aCBhbmQgd2l0aG91dCBsYWJlbHMKcGxvdDEgPC0gVmFyaWFibGVGZWF0dXJlUGxvdChsdW5nKQpwbG90MiA8LSBMYWJlbFBvaW50cyhwbG90ID0gcGxvdDEsIHBvaW50cyA9IHRvcDEwLCByZXBlbCA9IFRSVUUpCnBsb3QxK3Bsb3QyCmBgYAoKVG9wIDEwIG1vc3QgdmFyaWFibGUgZ2VuZXM6CmBgYHtyfQp0b3AxMApgYGAKCiMgU2NhbGluZyB0aGUgZGF0YQoKQWxsIGxvZy1ub3JtYWxpemVkIGNvdW50cyBhcmUgdHJhbnNmb3JtZWQgc28gdGhhdCB0aGV5IGhhdmUgbWVhbiAwIGFuZCB2YXJpYW5jZSAxIGFjcm9zcyBhbGwgY2VsbHMsIHJlZ2FyZGxlc3Mgb2YgdGhlIGNvdW50IHZhbHVlcyAoaGlnaCBvciBsb3cpLiBUaGlzIGluIHByYWN0aWNlIGNvcnJlc3BvbmQgdG8gYSAiYmluYXJpemF0aW9uIiBvZiB0aGUgZGF0YSwgb3IgcmF0aGVyICJ0ZXJuYXJpemF0aW9uIiwgd2hlcmUgaW4gZ2VuZXJhbCBmb3IgZWFjaCBjZWxsIGEgZ2VuZSB3aWxsIGJlIHRlbmQgInVwIiAoPjApLCAiYXZlcmFnZSIgKD0wKSBvciAiZG93biIgKDwwKSB3aXRoIHZlcnkgY2xvc2UgdmFsdWVzLiAKYGBge3J9CmFsbC5nZW5lcyA8LSByb3duYW1lcyhsdW5nKQpsdW5nIDwtIFNjYWxlRGF0YShsdW5nLCBmZWF0dXJlcyA9IGFsbC5nZW5lcykKYGBgCgpUaGUgbGlzdCBvZiBTIGFuZCBHMk0gc3BlY2lmaWMgZ2VuZXMgYXJlIGFscmVhZHkgcHJlLWxvYWRlZCBpbiBTZXVyYXQsIGluICJjYy5nZW5lcyIuIFdlIGNhbiBzZWdyZWdhdGUgdGhpcyBsaXN0IGludG8gbWFya2VycyBvZiBHMi9NIHBoYXNlIGFuZCBtYXJrZXJzIG9mIFMgcGhhc2UuCmBgYHtyfQpzLmdlbmVzIDwtIHN0cmluZ3I6OnN0cl90b190aXRsZShjYy5nZW5lcyRzLmdlbmVzKSAjIHN0cmluZ3I6OnN0cl90b190aXRsZSgpOiBmcm8gZXguIGZyb20gIlBDTkEiIHRvICJQY25hIiBzaW5jZSB3ZSBhcmUgd29ya2luZyBvbiBNbSBnZW5lcwpnMm0uZ2VuZXMgPC0gc3RyaW5ncjo6c3RyX3RvX3RpdGxlKGNjLmdlbmVzJGcybS5nZW5lcykKbHVuZyA8LSBDZWxsQ3ljbGVTY29yaW5nKGx1bmcsIHMuZmVhdHVyZXMgPSBzLmdlbmVzLCBnMm0uZmVhdHVyZXMgPSBnMm0uZ2VuZXMsIHNldC5pZGVudCA9IFRSVUUpCmBgYAoKVmlldyBjZWxsIGN5Y2xlIHNjb3JlcyBhbmQgcGhhc2UgYXNzaWdubWVudHMuCmBgYHtyfQpoZWFkKGx1bmdbW11dKQpgYGAKUnVubmluZyBhIFBDQSBvbiBjZWxsIGN5Y2xlIGdlbmVzIHJldmVhbHMgdGhhdCBjZWxscyBkbyBub3Qgc2VwYXJhdGUgZW50aXJlbHkgYnkgcGhhc2UgLT4gbm8gbmVlZCB0byByZWdyZXNzIHRoZW0gb3V0LgpgYGB7cn0KbHVuZyA8LSBSdW5QQ0EobHVuZywgZmVhdHVyZXMgPSBjKHMuZ2VuZXMsIGcybS5nZW5lcykpCkRpbVBsb3QobHVuZykKYGBgCgojIFBlcmZvcm0gbGluZWFyIGRpbWVuc2lvbmFsIHJlZHVjdGlvbgoKVGhlIG5leHQgc3RlcCBpcyB0aGUgZGltZW5zaW9uYWxpdHkgcmVkdWN0aW9uLCBvbiBzY2FsZWQgY291bnRzIGFuZCBvbiB0aGUgMjAwMCBtb3N0IHZhcmlhYmxlIGdlbmVzIGV4dHJhY3RlZCBiZWZvcmUsIHdpdGggUENBLgpgYGB7cn0KbHVuZyA8LSBSdW5QQ0EobHVuZywgZmVhdHVyZXMgPSBWYXJpYWJsZUZlYXR1cmVzKG9iamVjdCA9IGx1bmcpKQpwcmludChsdW5nW1sicGNhIl1dLCBkaW1zID0gMTo1LCBuZmVhdHVyZXMgPSA1KQpWaXpEaW1Mb2FkaW5ncyhsdW5nLCBkaW1zID0gMToyLCByZWR1Y3Rpb24gPSAicGNhIikKYGBgCgpXZSB2aXN1YWxpemUgdGhlIHJlc3VsdHMgb24gdGhlIGZpcnN0IHR3byBQQy4KYGBge3J9CkRpbVBsb3QobHVuZywgcmVkdWN0aW9uID0gInBjYSIpCmBgYAoKV2UgY2FuIHBsb3QgdGhlIGRhdGEgcHJvamVjdGVkIG9uIGFueSBvZiB0aGUgUENBIGRpbWVuc2lvbnMuCmBgYHtyfQpEaW1QbG90KGx1bmcsIHJlZHVjdGlvbiA9ICJwY2EiLCBkaW1zID0gYygzLDQpKQpgYGAKCkhlYXRtYXAgb2YgdGhlIG1vc3QgdmFyaWFibGUgZ2VuZXMgb24gdGhlIG1vc3QgdmFyaWFibGUgY2VsbHMgaW4gdGhlIGZpcnN0IFBDOgpgYGB7cn0KRGltSGVhdG1hcChsdW5nLCBkaW1zID0gMSwgY2VsbHMgPSA1MDAsIGJhbGFuY2VkID0gVFJVRSkKYGBgCgpUaGVyZSBhcmUgc2V2ZXJhbCBnZW5lcyB3aXRoIGEgY2xlYXIgZGlmZmVyZW5jZSBvZiBleHByZXNzaW9uIGFjcm9zcyB0aGUgbW9zdCB2YXJpYWJsZSBjZWxscywgc3BsaXR0aW5nIHRoZW0gaW50byB0d28gbWFpbiBncm91cHMuIFRoZSBzYW1lIGNhbiBiZSBkb25lIGZvciB0aGUgb3RoZXIgUENzLgpgYGB7cn0KRGltSGVhdG1hcChsdW5nLCBkaW1zID0gMTo5LCBjZWxscyA9IDUwMCwgYmFsYW5jZWQgPSBUUlVFKQpgYGAKCiMgRGV0ZXJtaW5lIHRoZSAiZGltZW5zaW9uYWxpdHkiIG9mIHRoZSBkYXRhc2V0CgpOZXh0IHN0ZXAgaXMgdG8gZGV0ZXJtaW5lIHRoZSAiZGltZW5zaW9uYWxpdHkiIG9mIHRoZSBkYXRhc2V0LiAiU2lnbmlmaWNhbnQiIFBDcyBpbiBKYWNrU3RyYXdQbG90KCkgd2lsbCBzaG93IGEgc3Ryb25nIGVucmljaG1lbnQgb2YgZmVhdHVyZXMgd2l0aCBsb3cgcC12YWx1ZXMgKHNvbGlkIGN1cnZlIGFib3ZlIHRoZSBkYXNoZWQgbGluZSkuIApgYGB7cn0KbHVuZyA8LSBKYWNrU3RyYXcobHVuZywgbnVtLnJlcGxpY2F0ZSA9IDEwMCkKbHVuZyA8LSBTY29yZUphY2tTdHJhdyhsdW5nLCBkaW1zID0gMToyMCkKYGBgCmBgYHtyfQpKYWNrU3RyYXdQbG90KGx1bmcsIGRpbXMgPSAxOjIwKQpgYGAKCkFuIGFsdGVybmF0aXZlIGhldXJpc3RpYyBtZXRob2QgZ2VuZXJhdGVzIGFuICJFbGJvdyBwbG90Ii4gSGVyZSB3ZSBjYW4gb2JzZXJ2ZSBhbiAiZWxib3ciIGFyb3VuZCBQQzIwLCBzdWdnZXN0aW5nIHRoYXQgdGhlIG1ham9yaXR5IG9mIHRydWUgc2lnbmFsIGlzIGNhcHR1cmVkIGluIHRoZSBmaXJzdCAyMCBQQ3MuCmBgYHtyfQpFbGJvd1Bsb3QobHVuZykKYGBgCgojIENsdXN0ZXIgdGhlIGNlbGxzCgpTZXR0aW5nIHJlc29sdXRpb24gYmV0d2VlbiAwLjQtMS4yIHR5cGljYWxseSByZXR1cm5zIGdvb2QgcmVzdWx0cyBmb3Igc2luZ2xlLWNlbGwgZGF0YXNldHMgb2YgYXJvdW5kIDNLIGNlbGxzLiBPcHRpbWFsIHJlc29sdXRpb24gb2Z0ZW4gaW5jcmVhc2VzIGZvciBsYXJnZXIgZGF0YXNldHMuIFRoZSBjbHVzdGVycyBjYW4gYmUgZm91bmQgdXNpbmcgdGhlIElkZW50cyBmdW5jdGlvbi4KYGBge3J9Cmx1bmcgPC0gRmluZE5laWdoYm9ycyhsdW5nLCBkaW1zID0gMToyMCkKbHVuZyA8LSBGaW5kQ2x1c3RlcnMobHVuZywgcmVzb2x1dGlvbiA9IDAuNSkKaGVhZChJZGVudHMobHVuZyksIDUpCmBgYAoKTm93IHRoZSBQQ0EgcGxvdCBpcyBsYWJlbGVkIHdpdGggY2x1c3RlcnMuCmBgYHtyfQpEaW1QbG90KGx1bmcsIHJlZHVjdGlvbiA9ICJwY2EiKQpgYGAKCiMgUnVuIG5vbi1saW5lYXIgZGltZW5zaW9uYWwgcmVkdWN0aW9uIAoKVU1BUDoKYGBge3J9CiNyZXRpY3VsYXRlOjpweV9pbnN0YWxsKHBhY2thZ2VzID0ndW1hcC1sZWFybicpCmx1bmcgPC0gUnVuVU1BUChsdW5nLCBkaW1zID0gMToyMCkKRGltUGxvdChsdW5nLCByZWR1Y3Rpb24gPSAidW1hcCIpCmBgYAoKV2UgdmlzdWFsaXplIGhlcmUgYSB0YWJsZSBjb250YWluaW5nIChpZGVudGlmaWVkIGJ5IG51bWJlcikgdGhlIG51bWJlciBvZiBjZWxscyBiZWxvbmdpbmcgdG8gZWFjaCBjbHVzdGVyLCB1c2VmdWwgdG8gYmUgY29tcGFyZWQgdG8gdGhlICJjbHVzdGVyIHRhYmxlIiBvbiBQYW5nbGFvREIgKGh0dHBzOi8vcGFuZ2xhb2RiLnNlL2xpc3RfY2x1c3RlcnNfYW5kX2NlbGxfdHlwZXMuaHRtbD9zcmE9U1JBNjUzMTQ2JnNycz1TUlMzMDQ0MjYxKS4KYGBge3J9CnRhYmxlKElkZW50cyhsdW5nKSkKYGBgCgpMZXQncyBzYXZlIHRoZSBvYmplY3QuCmBgYHtyfQpzYXZlUkRTKGx1bmcsIGZpbGUgPSAibHVuZy5yZHMiKQpgYGAKCiMgRmluZGluZyBkaWZmZXJlbnRpYWxseSBleHByZXNzZWQgZmVhdHVyZXMgKGNsdXN0ZXIgYmlvbWFya2VycykKClRoZSBmaW5hbCBzdGVwIGlzIHRvIGdpdmUgYW4gImlkZW50aXR5IiB0byB0aGUgY2x1c3RlcnMuIFRoYXQgaXMsIGZpbmQgd2hpY2ggYXJlIHRoZSAibWFya2VyIGdlbmVzIiAoZXhwcmVzc2VkIGV4Y2x1c2l2ZWx5LCBvciBhdCBsZWFzdCBvdmVyLWV4cHJlc3NlZCkgaW4gZWFjaCBjbHVzdGVyIHdpdGggcmVzcGVjdCB0byB0aGUgb3RoZXJzLiBUaGVuLCB0cnlpbmcgdG8gZmlndXJlIG91dCwgYWNjb3JkaW5nIHRvIHRoZSBtYXJrZXIgZ2VuZXMgb2YgZWFjaCBjbHVzdGVyLCB3aGF0IGNvdWxkIGJlIHRoZSBjb3JyZXNwb25kaW5nIGNlbGwgdHlwZS4KCldlIGNhbiBmaW5kIHRoZSAiZGlmZmVyZW50aWFsbHkgZXhwcmVzc2VkIGdlbmVzIiBmb3IgYSBjbHVzdGVyIGFnYWluc3QgdGhlIG90aGVycywgd2l0aCB0aGUgYWRkaXRpb25hbCBjb25kaXRpb24gdGhhdCB0aGUgZ2VuZSBoYXMgdG8gYmUgZXhwcmVzc2VkIGluIGF0IGxlYXN0IHRoZSAyNSUgb2YgdGhlIGNlbGxzIGluIHRoZSBjbHVzdGVyOgpgYGB7cn0KY2x1c3RlcjEubWFya2VycyA8LSBGaW5kTWFya2VycyhsdW5nLCBpZGVudC4xID0gMSwgbWluLnBjdCA9IDAuMjUpCmBgYApgYGB7cn0KaGVhZChjbHVzdGVyMS5tYXJrZXJzLCBuID0gNSkKYGBgCgpPciBjaG9vc2UgdG8gY29tcGFyZSBleHBsaWNpdGx5IHR3byBvciBtb3JlIGNsdXN0ZXJzLiBIZXJlIGNsdXN0ZXIgNSBpcyBjb21wYXJlZCB0byBjbHVzdGVycyAwIGFuZCAzLgpgYGB7cn0KY2x1c3RlcjUubWFya2VycyA8LSBGaW5kTWFya2VycyhsdW5nLCBpZGVudC4xID0gNSwgaWRlbnQuMiA9IGMoMCwgMyksIG1pbi5wY3QgPSAwLjI1KQpgYGAKYGBge3J9CmhlYWQoY2x1c3RlcjUubWFya2VycywgbiA9IDUpCmBgYAoKT3IsIGZpbmQgdGhlIG1hcmtlcnMgb2YgYWxsIGNsdXN0ZXJzLCB3aXRoIHJlc3BlY3QgdG8gdGhlIG90aGVycyAodGhpcyB3aWxsIHRha2UgbG9uZ2VyKS4gVGhlIHJlc3VsdHMgYXJlIHN0b3JlZCBpbiBhIHRhYmxlIHdpdGggdHdvIGdlbmVzIGZvciBlYWNoIGNsdXN0ZXIuIEhlcmUgdGhlIGNyaXRlcmlhIGFyZSBhKSBleHByZXNzZWQgaW4gYXQgbGVhc3QgMjUlIG9mIHRoZSBjZWxscyBhbmQgYikgd2l0aCBhIGxvZzIgZm9sZCBjaGFuZ2Ugb2YgYXQgbGVhc3QgMC4yNS4KYGBge3J9Cmx1bmcubWFya2VycyA8LSBGaW5kQWxsTWFya2VycyhsdW5nLCBvbmx5LnBvcyA9IFRSVUUsIG1pbi5wY3QgPSAwLjI1LCBsb2dmYy50aHJlc2hvbGQgPSAwLjI1KQpgYGAKYGBge3J9Cmx1bmcubWFya2VycyAlPiUgZ3JvdXBfYnkoY2x1c3RlcikgJT4lIHRvcF9uKG4gPSAyLCB3dCA9IGF2Z19sb2cyRkMpCmBgYAoKClRoZXJlIGFyZSBzZXZlcmFsIHRvb2xzIGZvciB2aXN1YWxpemluZyBtYXJrZXIgZXhwcmVzc2lvbi4gVmxuUGxvdCgpIChzaG93cyBleHByZXNzaW9uIHByb2JhYmlsaXR5IGRpc3RyaWJ1dGlvbnMgYWNyb3NzIGNsdXN0ZXJzKSwgYW5kIEZlYXR1cmVQbG90KCkgYXJlIG1vc3QgY29tbW9ubHkgdXNlZCB2aXN1YWxpemF0aW9ucy4gVGhlcmUgYXJlIGFsc28gUmlkZ2VQbG90KCkgYW5kIERvdFBsb3QoKSBhcyBhZGRpdGlvbmFsIG1ldGhvZHMgdG8gdmlldyBvdXIgZGF0YXNldC4KYGBge3J9CiMgVmlvbGluIHBsb3Qgb2YgdGhlIGV4cHJlc3Npb24gb2YgdGhlIGJlc3QgIm1hcmtlciBnZW5lIiBmb3IgZWFjaCBjbHVzdGVyOgojIFZsblBsb3QobHVuZywgZmVhdHVyZXMgPSBmZWF0dXJlcyA9IGMoIlNsYzdhMTAiLCAiR3ptYSIsICJSYnAxIiwgIlBpMTYiLCAiVmNhbTEiLCAiTXVzdG4xIiwgIkNveDRpMiIsICJFYXIyIiwgIk1zNGE2YyIsICJJZ2tjIiwgIlNmdHBhMSIsICJBazciLCAiQXFwMSIsICJSYWMyIikpIHNwbGl0dGVkIGludG8gKGZvciBiZXR0ZXIgdmlzdWFsaXphdGlvbik6ClZsblBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJTbGM3YTEwIiwgIkd6bWEiLCAiUmJwMSIpKQpWbG5QbG90KGx1bmcsIGZlYXR1cmVzID0gYygiUGkxNiIsICJWY2FtMSIsICJNdXN0bjEiKSkKVmxuUGxvdChsdW5nLCBmZWF0dXJlcyA9IGMoIkNveDRpMiIsICJFYXIyIiwgIk1zNGE2YyIpKSAKVmxuUGxvdChsdW5nLCBmZWF0dXJlcyA9IGMoIklna2MiLCAiU2Z0cGExIiwgIkFrNyIpKSAKVmxuUGxvdChsdW5nLCBmZWF0dXJlcyA9ICJBcXAxIikgClZsblBsb3QobHVuZywgZmVhdHVyZXMgPSAiUmFjMiIpICMgVW5rbm93bgojRmVhdHVyZVBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJTbGM3YTEwIiwgIkdhdGEzIiwgIkd6bWEiLCAiQ2NsNSIsICJJZm5nIiwgIlJicDEiLCAiUGkxNiIsICJWY2FtMSIsICJNdXN0bjEiLCAiQWN0YTIiLCAiQ294NGkyIiwgIkhpZ2QxYiIsICJDaGlsMyIsICJFYXIxIiwgIkVhcjIiLCAiTXM0YTZjIiwgIkNjcjciLCAiSWdrYyIsICJTZnRwYTEiLCAiU2Z0cGMiLCAiU2Z0YTIiLCAiRWdmbDYiLCAiTGFtcDMiLCAiQWs3IiwgIkFxcDEiLCAiQmNhbSIsICJSYWMyIiwgIkF0ZjMiKSkgc3BsaXR0ZWQgaW50byAoZm9yIGJldHRlciB2aXN1YWxpemF0aW9uKToKRmVhdHVyZVBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJTbGM3YTEwIiwgIkdhdGEzIiwgIkd6bWEiLCAiQ2NsNSIpKQpGZWF0dXJlUGxvdChsdW5nLCBmZWF0dXJlcyA9IGMoIklmbmciLCAiUmJwMSIsICJQaTE2IiwgIlZjYW0xIikpCkZlYXR1cmVQbG90KGx1bmcsIGZlYXR1cmVzID0gYygiTXVzdG4xIiwgIkFjdGEyIiwgIkNveDRpMiIsICJIaWdkMWIiKSkKRmVhdHVyZVBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJDaGlsMyIsICJFYXIxIiwgIkVhcjIiLCAiTXM0YTZjIikpCkZlYXR1cmVQbG90KGx1bmcsIGZlYXR1cmVzID0gYygiQ2NyNyIsICJJZ2tjIiwgIlNmdHBhMSIsICJTZnRwYyIpKQpGZWF0dXJlUGxvdChsdW5nLCBmZWF0dXJlcyA9IGMoIlNmdGEyIiwgIkVnZmw2IiwgIkxhbXAzIiwgIkFrNyIpKQpGZWF0dXJlUGxvdChsdW5nLCBmZWF0dXJlcyA9IGMoIkFxcDEiLCAiQmNhbSIsICJSYWMyIiwgIkF0ZjMiKSkKUmlkZ2VQbG90KGx1bmcsIGZlYXR1cmVzID0gYygiR3ptYSIsICJFYXIyIiksIG5jb2wgPSAyKSAjIFJpZGdlIHBsb3RzIGZyb20gZ2dyaWRnZXMuIFZpc3VhbGl6ZSBzaW5nbGUgY2VsbCBleHByZXNzaW9uIGRpc3RyaWJ1dGlvbnMgaW4gZWFjaCBjbHVzdGVyCkRvdFBsb3QobHVuZywgZmVhdHVyZXMgPSBjKCJTbGM3YTEwIiwgIkdhdGEzIiwgIkd6bWEiLCAiQ2NsNSIsICJJZm5nIiwgIlJicDEiLCAiUGkxNiIsICJWY2FtMSIsICJNdXN0bjEiLCAiQWN0YTIiLCAiQ294NGkyIiwgIkhpZ2QxYiIsICJDaGlsMyIsICJFYXIxIiwgIkVhcjIiLCAiTXM0YTZjIiwgIkNjcjciLCAiSWdrYyIsICJTZnRwYTEiLCAiU2Z0cGMiLCAiU2Z0YTIiLCAiRWdmbDYiLCAiTGFtcDMiLCAiQWs3IiwgIkFxcDEiLCAiQmNhbSIsICJSYWMyIiwgIkF0ZjMiLCAiSXJmOCIpKSArIFJvdGF0ZWRBeGlzKCkgIyB0aGUgc2l6ZSBvZiB0aGUgZG90IGNvcnJlc3BvbmRzIHRvIHRoZSBwZXJjZW50YWdlIG9mIGNlbGxzIGV4cHJlc3NpbmcgdGhlIGZlYXR1cmUgaW4gZWFjaCBjbHVzdGVyLiBUaGUgY29sb3IgcmVwcmVzZW50cyB0aGUgYXZlcmFnZSBleHByZXNzaW9uIGxldmVsCmBgYAoKRG9IZWF0bWFwKCkgZ2VuZXJhdGVzIGFuIGV4cHJlc3Npb24gaGVhdG1hcCBmb3IgZ2l2ZW4gY2VsbHMgYW5kIGZlYXR1cmVzLiBJbiB0aGlzIGNhc2UsIHdlIGFyZSBwbG90dGluZyB0aGUgdG9wIDEwIG1hcmtlcnMgZm9yIGVhY2ggY2x1c3Rlci4KYGBge3J9Cmx1bmcubWFya2VycyAlPiUKICAgIGdyb3VwX2J5KGNsdXN0ZXIpICU+JQogICAgdG9wX24obiA9IDEwLCB3dCA9IGF2Z19sb2cyRkMpIC0+IHRvcDEwCkRvSGVhdG1hcChsdW5nLCBmZWF0dXJlcyA9IHRvcDEwJGdlbmUpICsgTm9MZWdlbmQoKQpgYGAKCiMgQXNzaWduaW5nIGNlbGwgdHlwZSBpZGVudGl0eSB0byBjbHVzdGVycwoKYGBge3J9Cm5ldy5jbHVzdGVyLmlkcyA8LSBjKCJGaWJyb2JsYXN0cyIsICJOSyIsICJGaWJyb2JsYXN0cyIsICJGaWJyb2JsYXN0cyIsICJGaWJyb2JsYXN0cyIsICJTTUNzIiwgIlBlcmljeXRlcyIsICJBbHZlb2xhciBtLiIsICJNYWNyb3BoYWdlcyIsICJCIiwgIlBuZXVtb2N5dGVzSUkiLCAiRXBlbmR5bWFsIiwgIkVuZG90aGVsaWFsIiwgIlVua25vd24iKQpuYW1lcyhuZXcuY2x1c3Rlci5pZHMpIDwtIGxldmVscyhsdW5nKQpsdW5nIDwtIFJlbmFtZUlkZW50cyhsdW5nLCBuZXcuY2x1c3Rlci5pZHMpCkRpbVBsb3QobHVuZywgcmVkdWN0aW9uID0gInVtYXAiLCBsYWJlbCA9IFRSVUUsIHB0LnNpemUgPSAwLjUpICsgTm9MZWdlbmQoKQpzYXZlUkRTKGx1bmcsIGZpbGUgPSAibHVuZ19maW5hbC5yZHMiKQpgYGAKU2luZ2xlIGNlbGwgaGVhdG1hcCBzaG93aW5nIHRoZSBleHByZXNzaW9uIG9mIGJlc3Qg4oCcbWFya2VyIGdlbmXigJ0gZm9yIGVhY2ggY2x1c3RlciBvdmVybGF5ZWQgb24gdGhlIFVNQVAgcHJvamVjdGlvbiBvZiB0aGUgY2VsbHMuCgpgYGB7cn0KRG9IZWF0bWFwKHN1YnNldChsdW5nLCBkb3duc2FtcGxlID0gMTAwKSwgZmVhdHVyZXMgPSBjKCJTbGM3YTEwIiwgIkd6bWEiLCAiUmJwMSIsICJQaTE2IiwgIlZjYW0xIiwgIk11c3RuMSIsICJDb3g0aTIiLCAiRWFyMiIsICJNczRhNmMiLCAiSWdrYyIsICJTZnRwYTEiLCAiQWs3IiwgIkFxcDEiLCAiUmFjMiIpLCBzaXplID0gMykKYGBgCgojIHNlc3Npb25JbmZvKCkKClRoaXMgYW5hbHlzaXMgd2FzIGNvbmR1Y3RlZCBvbjoKYGBge3J9CnNlc3Npb25JbmZvKCkKYGBgCgojIEJpYmxpb2dyYXBoeQoKSGFvLCBIYW8sIGV0IGFsLiwgQ2VsbCAyMDIxOyBTZXVyYXQgVjQ7IGh0dHBzOi8vc2F0aWphbGFiLm9yZy9zZXVyYXQvCgpHaXVsaW8gUGF2ZXNpOyBTZXVyYXQgZXhhbXBsZSBvbiAxMHggRGF0YSAoMjAyMSB1cGRhdGUpOyBodHRwOi8vMTU5LjE0OS4xNjAuNTYvVHJhbnNjcmlwdG9taWNzL3NldXJhdC5odG1s